A concatenative Mandarin TTS system without prosody model and prosody modification

نویسندگان

  • Min Chu
  • Hu Peng
  • Eric Chang
چکیده

This paper proposes a two-step solution for generating natural prosody in TTS, in which no prosody prediction and modification are needed. A large phonetically and prosodically enriched speech corpus has been collected as the unit pool for the synthesizer. A multi-tier non-uniform unit selection scheme is developed to pick up the most suitable segments for concatenation from the unit pool. Final decisions for all units in the utterance to be synthesized are made by minimizing the overall concatenative cost of the whole utterance. Result from a subjective evaluation shows that the average concatenative cost of a synthesized utterance is highly correlated with its naturalness.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Modular Design for Mandarin Text-to-speech Synthesis

In the European Union funded project Technology and Corpora for Speech-to-Speech Translation (TC-STAR) [3], we have developed a modular concatenative TTS system for Mandarin Chinese. A common architecture has been introduced based on well-defined modules and interfaces. Three main modules, text processing, prosody processing and acoustic synthesis modules, are used following a commonly employed...

متن کامل

Improving intelligibility of synthesized speech in noise with emphasized prosody

The performance of current high quality concatenative text-to-speech (TTS) systems is limited under noisy environments. This paper investigates whether or not the intelligibility of synthesized speech in noise can be improved by emphasizing the prosody. Additionally, the paper presents a method that can effectively emphasize the prosody of units in existing TTS databases. The circular linear pr...

متن کامل

Implementing an SSML compliant concatenative TTS system

The W3C Speech Synthesis Markup Language (SSML) unifies a number of recent related markup languages that have emerged to fill the perceived need for increased, and standardized, user control over Text to Speech (TTS) engines. One of the main drivers for markup has been the increasing use of TTS engines as embedded components of specific applications – which means they are in a position to take ...

متن کامل

Perceptually based automatic prosody labeling and prosodically enriched unit selection improve concatenative text-to-speech synthesis

Prosody is an important factor in the quality of text-tospeech (TTS) synthesis. Typically, acoustic parameters such as f0 and duration are the only variables related to prosody that are used to determine unit selection. Our study explored adding the explicit use of linguistically and perceptually motivated prosodic categories in unit selection-based TTS. One of our goals was to automate the pro...

متن کامل

A Novel Prosody Adaptation Method for Mandarin Concatenation- Based Text-to-speech System

The paper presents a prosody adaptation method which is able to adapt the prosody model of text to speech (TTS) to a new style with a small training corpus. Unlike the conventional prosody mapping between two parallel prosody features, the paper tries to integrate the prosody conversion into the prosody generation model of TTS. In the paper, we use a template based prosody model which consists ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001